现代机器学习研究依赖于相对较少的精心策划数据集。即使在这些数据集中,通常在“不整合”或原始数据中,从业人员也面临着重要的数据质量和多样性问题,这些问题可能会非常强烈地解决。应对这些挑战的现有方法往往会对特定问题做出强烈的假设,并且通常需要先验知识或元数据,例如域标签。我们的工作与这些方法是正交的:相反,我们专注于为元数据考古学提供一个统一和有效的框架 - 在数据集中发现和推断示例的元数据。我们使用简单的转换策划了可能存在的数据集(例如,错误标记,非典型或过度分布示例)中可能存在的数据子集,并利用这些探针套件之间的学习动力学差异来推断感兴趣的元数据。我们的方法与跨不同任务的更复杂的缓解方法相提并论:识别和纠正标签错误的示例,对少数民族样本进行分类,优先考虑与培训相关的点并启用相关示例的可扩展人类审核。
translated by 谷歌翻译
人们普遍认为,人类视觉系统偏向于识别形状而不是纹理。这一假设导致了越来越多的工作,旨在使深层模型的决策过程与人类视野的基本特性保持一致。人们对形状特征的依赖主要预计会改善协变量转移下这些模型的鲁棒性。在本文中,我们重新审视了形状偏置对皮肤病变图像分类的重要性。我们的分析表明,不同的皮肤病变数据集对单个图像特征表现出不同的偏见。有趣的是,尽管深层提取器倾向于学习对皮肤病变分类的纠缠特征,但仍然可以从该纠缠的表示形式中解码单个特征。这表明这些功能仍在模型的学习嵌入空间中表示,但不用于分类。此外,不同数据集的光谱分析表明,与常见的视觉识别相反,皮肤皮肤病变分类本质上依赖于超出形状偏置的复杂特征组合。自然的结果,在某些情况下,摆脱了形状偏见模型的普遍欲望甚至可以改善皮肤病变分类器。
translated by 谷歌翻译
我们介绍了一种新颖的方法,用于使用时间戳监督进行时间戳分割。我们的主要贡献是图形卷积网络,该网络以端到端方式学习,以利用相邻帧之间的帧功能和连接,以从稀疏的时间戳标签中生成密集的框架标签。然后可以使用生成的密集框架标签来训练分割模型。此外,我们为分割模型和图形卷积模型进行交替学习的框架,该模型首先初始化,然后迭代地完善学习模型。在四个公共数据集上进行了详细的实验,包括50种沙拉,GTEA,早餐和桌面组件,表明我们的方法优于多层感知器基线,同时在时间活动中表现出色或更好地表现出色或更好在时间戳监督下。
translated by 谷歌翻译
The performance of the Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from performance deficiency. This paper presents a method that aims to learn a confident model in the presence of noisy labels. This is done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding entropy or information-based regularizer to the classifier network. We conduct our experiments on a noisy version of MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluated the proposed method on the curated dataset, where the noise type and level of various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate how our model is more confident in predicting GT than other baselines. Finally, we assess our approach for segmentation problem and showcase its effectiveness with experiments.
translated by 谷歌翻译
Recent advances in upper limb prostheses have led to significant improvements in the number of movements provided by the robotic limb. However, the method for controlling multiple degrees of freedom via user-generated signals remains challenging. To address this issue, various machine learning controllers have been developed to better predict movement intent. As these controllers become more intelligent and take on more autonomy in the system, the traditional approach of representing the human-machine interface as a human controlling a tool becomes limiting. One possible approach to improve the understanding of these interfaces is to model them as collaborative, multi-agent systems through the lens of joint action. The field of joint action has been commonly applied to two human partners who are trying to work jointly together to achieve a task, such as singing or moving a table together, by effecting coordinated change in their shared environment. In this work, we compare different prosthesis controllers (proportional electromyography with sequential switching, pattern recognition, and adaptive switching) in terms of how they present the hallmarks of joint action. The results of the comparison lead to a new perspective for understanding how existing myoelectric systems relate to each other, along with recommendations for how to improve these systems by increasing the collaborative communication between each partner.
translated by 谷歌翻译
Nowadays, the current neural network models of dialogue generation(chatbots) show great promise for generating answers for chatty agents. But they are short-sighted in that they predict utterances one at a time while disregarding their impact on future outcomes. Modelling a dialogue's future direction is critical for generating coherent, interesting dialogues, a need that has led traditional NLP dialogue models that rely on reinforcement learning. In this article, we explain how to combine these objectives by using deep reinforcement learning to predict future rewards in chatbot dialogue. The model simulates conversations between two virtual agents, with policy gradient methods used to reward sequences that exhibit three useful conversational characteristics: the flow of informality, coherence, and simplicity of response (related to forward-looking function). We assess our model based on its diversity, length, and complexity with regard to humans. In dialogue simulation, evaluations demonstrated that the proposed model generates more interactive responses and encourages a more sustained successful conversation. This work commemorates a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
translated by 谷歌翻译
In this work, we introduce a hypergraph representation learning framework called Hypergraph Neural Networks (HNN) that jointly learns hyperedge embeddings along with a set of hyperedge-dependent embeddings for each node in the hypergraph. HNN derives multiple embeddings per node in the hypergraph where each embedding for a node is dependent on a specific hyperedge of that node. Notably, HNN is accurate, data-efficient, flexible with many interchangeable components, and useful for a wide range of hypergraph learning tasks. We evaluate the effectiveness of the HNN framework for hyperedge prediction and hypergraph node classification. We find that HNN achieves an overall mean gain of 7.72% and 11.37% across all baseline models and graphs for hyperedge prediction and hypergraph node classification, respectively.
translated by 谷歌翻译
A "heart attack" or myocardial infarction (MI), occurs when an artery supplying blood to the heart is abruptly occluded. The "gold standard" method for imaging MI is Cardiovascular Magnetic Resonance Imaging (MRI), with intravenously administered gadolinium-based contrast (late gadolinium enhancement). However, no "gold standard" fully automated method for the quantification of MI exists. In this work, we propose an end-to-end fully automatic system (MyI-Net) for the detection and quantification of MI in MRI images. This has the potential to reduce the uncertainty due to the technical variability across labs and inherent problems of the data and labels. Our system consists of four processing stages designed to maintain the flow of information across scales. First, features from raw MRI images are generated using feature extractors built on ResNet and MoblieNet architectures. This is followed by the Atrous Spatial Pyramid Pooling (ASPP) to produce spatial information at different scales to preserve more image context. High-level features from ASPP and initial low-level features are concatenated at the third stage and then passed to the fourth stage where spatial information is recovered via up-sampling to produce final image segmentation output into: i) background, ii) heart muscle, iii) blood and iv) scar areas. New models were compared with state-of-art models and manual quantification. Our models showed favorable performance in global segmentation and scar tissue detection relative to state-of-the-art work, including a four-fold better performance in matching scar pixels to contours produced by clinicians.
translated by 谷歌翻译
Increasing popularity of deep-learning-powered applications raises the issue of vulnerability of neural networks to adversarial attacks. In other words, hardly perceptible changes in input data lead to the output error in neural network hindering their utilization in applications that involve decisions with security risks. A number of previous works have already thoroughly evaluated the most commonly used configuration - Convolutional Neural Networks (CNNs) against different types of adversarial attacks. Moreover, recent works demonstrated transferability of the some adversarial examples across different neural network models. This paper studied robustness of the new emerging models such as SpinalNet-based neural networks and Compact Convolutional Transformers (CCT) on image classification problem of CIFAR-10 dataset. Each architecture was tested against four White-box attacks and three Black-box attacks. Unlike VGG and SpinalNet models, attention-based CCT configuration demonstrated large span between strong robustness and vulnerability to adversarial examples. Eventually, the study of transferability between VGG, VGG-inspired SpinalNet and pretrained CCT 7/3x1 models was conducted. It was shown that despite high effectiveness of the attack on the certain individual model, this does not guarantee the transferability to other models.
translated by 谷歌翻译
Human Activity Recognition (HAR) is an emerging technology with several applications in surveillance, security, and healthcare sectors. Noninvasive HAR systems based on Wi-Fi Channel State Information (CSI) signals can be developed leveraging the quick growth of ubiquitous Wi-Fi technologies, and the correlation between CSI dynamics and body motions. In this paper, we propose Principal Component-based Wavelet Convolutional Neural Network (or PCWCNN) -- a novel approach that offers robustness and efficiency for practical real-time applications. Our proposed method incorporates two efficient preprocessing algorithms -- the Principal Component Analysis (PCA) and the Discrete Wavelet Transform (DWT). We employ an adaptive activity segmentation algorithm that is accurate and computationally light. Additionally, we used the Wavelet CNN for classification, which is a deep convolutional network analogous to the well-studied ResNet and DenseNet networks. We empirically show that our proposed PCWCNN model performs very well on a real dataset, outperforming existing approaches.
translated by 谷歌翻译